# Zero-shot depth estimation
**DepthPro HF** · apple · 3D Vision, Transformers · 13.96k downloads · 52 likes
DepthPro is a foundational model for zero-shot metric monocular depth estimation, capable of generating high-resolution, high-precision depth maps.
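Since the checkpoint above ships in the Transformers format, it can be tried through the generic depth-estimation pipeline. This is a minimal sketch, assuming a recent `transformers` release with DepthPro support and that the checkpoint id is `apple/DepthPro-hf`; the input filename is a placeholder.

```python
from transformers import pipeline
from PIL import Image

# Assumption: a recent transformers release that includes DepthPro,
# published under the checkpoint id "apple/DepthPro-hf".
depth_estimator = pipeline("depth-estimation", model="apple/DepthPro-hf")

image = Image.open("room.jpg")  # placeholder input image
result = depth_estimator(image)

# The depth-estimation pipeline returns a rendered depth map ("depth",
# a PIL image) alongside the raw tensor ("predicted_depth").
result["depth"].save("room_depth.png")
```

Because DepthPro predicts metric depth independent of camera intrinsics, the raw `predicted_depth` tensor can be read as distances rather than only relative ordering.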
**DepthPro Mixin** · apple · 3D Vision · 17 downloads · 8 likes
A zero-shot metric monocular depth estimation foundation model that synthesizes high-resolution depth maps with exceptional sharpness and high-frequency detail.
**DepthPro** · apple · 3D Vision · 1,986 downloads · 424 likes
A zero-shot metric monocular depth estimation foundation model that synthesizes high-resolution depth maps with absolute-scale prediction, independent of camera intrinsics.
**Stable Diffusion E2E FT Depth** · GonzaloMG · Apache-2.0 · 3D Vision · 28 downloads · 1 like
A monocular depth estimation model suited to zero-shot depth estimation in in-the-wild scenes.
**Depth Anything Large HF** · LiheYoung · Apache-2.0 · 3D Vision, Transformers · 147.17k downloads · 51 likes
Depth Anything is a depth estimation model based on the DPT architecture with a DINOv2 backbone, trained on approximately 62 million images and achieving state-of-the-art results in both relative and absolute depth estimation.
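The Depth Anything checkpoints follow the standard Transformers depth-estimation interface, so they can also be driven directly through the model and processor classes. A minimal sketch, assuming the checkpoint id from the entry above (the smaller `LiheYoung/depth-anything-small-hf` listed below is a drop-in, lighter alternative); the input filename is a placeholder:

```python
import torch
from PIL import Image
from transformers import AutoImageProcessor, AutoModelForDepthEstimation

# Checkpoint from the catalog entry above.
ckpt = "LiheYoung/depth-anything-large-hf"
processor = AutoImageProcessor.from_pretrained(ckpt)
model = AutoModelForDepthEstimation.from_pretrained(ckpt)

image = Image.open("street.jpg")  # placeholder input image
inputs = processor(images=image, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)

# Upsample the predicted depth back to the input resolution.
depth = torch.nn.functional.interpolate(
    outputs.predicted_depth.unsqueeze(1),
    size=image.size[::-1],  # PIL size is (width, height); interpolate wants (h, w)
    mode="bicubic",
    align_corners=False,
).squeeze()
```

Using the model classes directly, rather than the pipeline, gives control over batching, device placement, and how the raw depth tensor is post-processed.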
**Depth Anything Base HF** · LiheYoung · Apache-2.0 · 3D Vision, Transformers · 4,101 downloads · 10 likes
Depth Anything is a depth estimation model based on the DPT architecture with a DINOv2 backbone, trained on approximately 62 million images and achieving state-of-the-art performance in zero-shot depth estimation.
**Depth Anything Small HF** · LiheYoung · Apache-2.0 · 3D Vision, Transformers · 97.89k downloads · 29 likes
Depth Anything is a depth estimation model based on the DPT architecture with a DINOv2 backbone. It was trained on approximately 62 million images and excels in both relative and absolute depth estimation.
**Marigold Depth v1-0** · prs-eth · Apache-2.0 · 3D Vision, English · 92.50k downloads · 127 likes
A monocular depth estimation model fine-tuned from Stable Diffusion, producing affine-invariant depth predictions for natural scenes.
**DPT BEiT Large 512** · Intel · MIT · 3D Vision, Transformers · 2,794 downloads · 8 likes
A monocular depth estimation model based on the BEiT Transformer, capable of inferring fine-grained depth from a single image.
**DPT Large** · Intel · Apache-2.0 · 3D Vision, Transformers · 364.62k downloads · 187 likes
A monocular depth estimation model based on the Vision Transformer (ViT), trained on 1.4 million images and suited to zero-shot depth prediction.